Support RHAIIS images for the e2e tests #2032
base: main
Conversation
Signed-off-by: Dan Huang <[email protected]>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Summary of Changes
Hello @dhuangnm, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly expands the capabilities of the e2e testing framework by integrating support for running tests against vLLM instances deployed as RHAIIS images in a CI/CD context. It introduces a more flexible way to specify test configurations, including a new 'smoke test' option for quicker runs. The changes ensure that the testing infrastructure can adapt to different vLLM deployment scenarios, from local development to containerized environments, by managing environment variables, model saving paths, and the execution of vLLM commands, ultimately enhancing the robustness and versatility of the testing pipeline.
Code Review
This pull request introduces support for running e2e tests using RHAIIS images, which is a significant enhancement for the CI/CD pipeline. It also adds flexibility by allowing tests to be run from a configuration file list. The changes are well-structured. My review focuses on improving the robustness of the shell script handling file lists, adhering to Pythonic conventions, and refining the new logic for executing tests within a Kubernetes environment. I've suggested changes to prevent potential issues with filenames containing special characters, to improve code readability and maintainability by removing sentinel string values and duplicated code, and to ensure the kubectl command is used appropriately for non-interactive scripting.
tests/e2e/vLLM/test_vllm.py
Outdated
```python
with open(self.vllm_bash, "w") as cf:
    cf.write(f"""#!/bin/bash
    export HF_HUB_OFFLINE=0
    export VLLM_NO_USAGE_STATS=1
    {vllm_cmd}
    """)
```
The multiline f-string will preserve the leading whitespace from the source code, resulting in a shell script with incorrect indentation. This can lead to issues when the script is executed. It's better to construct the string without the extra indentation, for example by using string literal concatenation.
Suggested change:
```python
with open(self.vllm_bash, "w") as cf:
    cf.write(
        "#!/bin/bash\n"
        "export HF_HUB_OFFLINE=0\n"
        "export VLLM_NO_USAGE_STATS=1\n"
        f"{vllm_cmd}\n"
    )
```
tests/e2e/vLLM/run_tests.sh
Outdated
```diff
     CONFIGS=`ls "$CONFIG"`
 elif [ -f "$CONFIG" ]; then
     echo "Config is provided as a file: $CONFIG"
     CONFIGS=`cat "$CONFIG"`
 fi
 echo "$CONFIGS"

 # Parse list of configs.
-for MODEL_CONFIG in "$CONFIG"/*
+for MODEL_CONFIG in $(echo -e "$CONFIGS" | sed "s|^|${script_path}/configs/|")
```
Using ls and cat to build a list of files, and then iterating over it with an unquoted for loop is not robust. This will fail if any filenames contain spaces or other special characters due to word splitting. A safer approach would be to use a while read loop to process the configs line by line, which correctly handles such filenames.
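For reference, a minimal sketch of the while read pattern suggested here, assuming $CONFIG is a file listing one config name per line and script_path is set as in the PR's script; the echo is a placeholder for the actual test invocation:
```bash
# Process each config name safely; IFS= and -r keep leading/trailing
# whitespace and backslashes in the name intact, and no word splitting
# occurs, so names containing spaces are handled correctly.
while IFS= read -r CONFIG_NAME; do
    [ -z "$CONFIG_NAME" ] && continue            # skip blank lines
    MODEL_CONFIG="${script_path}/configs/${CONFIG_NAME}"
    echo "Would run tests for: $MODEL_CONFIG"    # placeholder for the test command
done < "$CONFIG"
```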
tests/e2e/vLLM/test_vllm.py
Outdated
```python
RUN_SAVE_DIR=os.environ.get("RUN_SAVE_DIR", "none")
# when using vllm image, needs to save the generated model
if VLLM_PYTHON_ENV.lower() != "same" and (not Path(VLLM_PYTHON_ENV).exists()):
    IS_VLLM_IMAGE = True
    assert RUN_SAVE_DIR != "none", "To use vllm image, RUN_SAVE_DIR must be set!"
```
Using the string "none" as a sentinel value for the RUN_SAVE_DIR environment variable is not idiomatic. It's more Pythonic to use None. You can get this behavior by calling os.environ.get("RUN_SAVE_DIR") without a default value, which returns None if the variable is not set. Then, checks can be performed with is not None. This should be applied everywhere RUN_SAVE_DIR is used (e.g., line 92).
Suggested change:
```python
RUN_SAVE_DIR=os.environ.get("RUN_SAVE_DIR")
# when using vllm image, needs to save the generated model
if VLLM_PYTHON_ENV.lower() != "same" and (not Path(VLLM_PYTHON_ENV).exists()):
    IS_VLLM_IMAGE = True
    assert RUN_SAVE_DIR is not None, "To use vllm image, RUN_SAVE_DIR must be set!"
```
tests/e2e/vLLM/test_vllm.py
Outdated
```python
if RUN_SAVE_DIR != "none":
    assert Path(RUN_SAVE_DIR).exists(), f"RUN_SAVE_DIR path doesn't exist: {RUN_SAVE_DIR}"
    self.run_save_dir = RUN_SAVE_DIR
    # RUN_SAVE_DIR overwrites config save_dir if specified
    self.save_dir = os.path.join(RUN_SAVE_DIR, self.model.split("/")[1] + f"-{self.scheme}")

if not self.save_dir:
    self.save_dir = self.model.split("/")[1] + f"-{self.scheme}"
```
The expression to generate the save directory name is duplicated. To follow the DRY (Don't Repeat Yourself) principle, you can calculate it once and store it in a variable for reuse. This improves maintainability.
Suggested change:
```python
save_dir_basename = self.model.split("/")[1] + f"-{self.scheme}"
if RUN_SAVE_DIR != "none":
    assert Path(RUN_SAVE_DIR).exists(), f"RUN_SAVE_DIR path doesn't exist: {RUN_SAVE_DIR}"
    self.run_save_dir = RUN_SAVE_DIR
    # RUN_SAVE_DIR overwrites config save_dir if specified
    self.save_dir = os.path.join(RUN_SAVE_DIR, save_dir_basename)

if not self.save_dir:
    self.save_dir = save_dir_basename
```
tests/e2e/vLLM/test_vllm.py
Outdated
```python
logger.info("vllm image. Run vllm cmd with kubectl.")
result = subprocess.Popen(
    [
        "kubectl", "exec", "-it",
```
The -it flags in kubectl exec are for allocating an interactive terminal session (pseudo-TTY). This is unnecessary for a non-interactive script and can cause issues in some CI environments where a TTY is not available. It's safer to remove them as they are not needed for this use case.
Suggested change:
```python
        "kubectl", "exec",
```
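For illustration, this is what the non-interactive form looks like at the shell level; the pod name and script path are hypothetical stand-ins, not names from the PR:
```bash
# No -it: kubectl does not request a TTY, so this also works in CI
# runners that have no terminal attached.
kubectl exec vllm-test-pod -- bash /tmp/vllm_cmd.sh
```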
Signed-off-by: Dan Huang <[email protected]>
dsikka left a comment
I think we want this separate from our normal e2e testing.
Key things to change:
- We should keep run_tests.sh as is and create a separate rhaiis image testing bash script
- The vLLM test is already written as functions, which we should pull into a separate rhaiis-e2e test
This doesn't affect our normal e2e testing. We don't change any existing ways to run the e2e tests; we're just adding a new way to allow the tests to run with RHAIIS images. The changes to run_tests.sh allow a smoke list file to be used; it still supports the current way of using the configs folder, and nothing is changed there either.
I think we should do the following:
- Separate the test into two steps: Model Generation (handled by set-up, run_oneshot_for_e2e_testing, and uploading to the hub), followed by vLLM Inference.
- Run the second step in either a vLLM image or using vLLM nightly. This will use the models generated from step (1) and will allow separation between upstream and midstream flows.
I think most of the changes are on the vLLM side, and this is also the most reflective of the actual flow we'd expect when using upstream and RHAIIS.
Signed-off-by: Dan Huang <[email protected]>
Thanks Dipika for the suggestions. I updated the code accordingly.
Please review again.
SUMMARY:
Add support to allow the e2e tests to run with RHAIIS images in our CI/CD infrastructure:
- Added a rhaiis-e2e-smoke.list file that lists the models as a smoke test for the e2e tests run with the RHAIIS images.
- Added run_tests_in_rhaiis.sh to run the e2e tests with the RHAIIS images, e.g.:
bash tests/e2e/vLLM/run_tests_in_rhaiis.sh -c tests/e2e/vLLM/e2e-smoke.list -t tests/e2e/vLLM/test_vllm.py -s <path to save results etc>
A successful run for the e2e tests using the RHAIIS images is here: https://github.com/neuralmagic/llm-compressor-testing/actions/runs/19676158533
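For context, a hypothetical sketch of what such a smoke list might contain, assuming one config name per line (the format the cat-based branch of run_tests.sh consumes); the file names below are illustrative, not taken from the PR:
```
fp8_dynamic_per_token.yaml
int8_dynamic_per_token.yaml
w4a16_grouped_quant.yaml
```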
TEST PLAN:
All tests